Technical University of Crete          Technical University of Crete

and  Columbia  University 


ABSTRACT 

In  stochastic  scheduling  and  optimal  maintenance  problems  that  have 
been  considered  in  the  literature,  the  optimization  criterion  used  has 
often  been  equivalent  to  minimizing  the  expected  first  passage  times  to  a 
set  of  states.  A  typical  method  used  in  establishing  the  optimality  of  a 
certain  policy  is  the  method  of  successive  approximations  on  the 
appropriate  dynamic  programming  functional  equations.  As  an 
intermediate  result,  this  technique  often  involves  the  optimality  of  the 
pertinent  policy  for  all  finite  horizon  versions  of  the  problem.  In  this 
paper we characterize stochastically optimal policies as policies that
possess a similar property, i.e. they are optimal in expectation for all
members of a sequence of appropriately defined finite horizon problems.
We use this characterization to establish the stochastic optimality of
relevant policies for the optimal repair allocation for a series system
problem and for a scheduling problem.


1. Introduction. In many problems that have been considered in the literature
of stochastic scheduling and maintenance, the optimization criterion
employed has often been to minimize the expected first passage times to a set
of "desirable" states. A typical method used in establishing the optimality of a
certain  policy  is  the  method  of  successive  approximations  on  the  appropriate 
dynamic  programming  functional  equations.  As  an  intermediate  result,  this 
technique  often  involves  the  optimality  of  the  pertinent  policy  for  all  finite 
horizon  versions  of  the  problem. 

*This research was partially supported by the NSF under Grant No.
DMS-84-05413 and the AFOSR under Contract AFOSR 87-0072.


In this paper we show that the stochastic optimality of a policy can be
obtained in a similar manner by establishing that the policy is optimal for a
class  of  appropriately  defined  finite  horizon  problems.  Furthermore  this 
approach  can  be  used  to  establish  that  a  stochastically  optimal  policy  does  not 
exist.  It  is  known  that  stochastic  optimality  is  the  strongest  optimization 
criterion  since  it  implies  optimality  under  both  the  expected  and  the 
discounted  first  passage  time  criteria. 

In this paper we examine two applications of the property above.

We first consider a generalization of the problem of optimal allocation over
time of a single repairman to failed components of a series system, previously
considered in Katehakis and Derman (1984); see that paper for references to
other work on this problem. Operation in a varying environment is considered
and the following assumptions are made. Let $N$ denote the number of
components in the system. Let $\theta$ denote the state of the environment,
which is observable, and let $\Theta$ denote the set of all possible states of the
environment; $\Theta$ is assumed (for simplicity) to be finite. Furthermore, we
model the law of motion for the state of the environment by a continuous time
Markov chain $\{\theta(t), t \ge 0\}$ with known transition rates
$\{q(\theta'/\theta), \theta, \theta' \in \Theta\}$.
Components may be either in a functioning or in a failed state. To model the
effect of the operating environment on the time to failure of the components,
we assume that when the state of the environment is $\theta$ the failure time of
the $i$-th component is an exponentially distributed random variable with
known rate $\mu_i(\theta)$ ($1 \le i \le N$, $\theta \in \Theta$). In addition we assume that the
failure time of any component is independent of the state of the other components.
The time required to repair component $i$ is also an exponentially distributed
random variable, the rate $\lambda_i$ of which is independent of the state of the
environment. Repaired components are as good as new. It is assumed that it
is possible to reassign the repairman among failed components instantaneously.

In this paper it is shown that the policy which always assigns the
repairman to the failed component with the smallest failure rate among the
failed ones (smallest failure rate first or SFR policy) minimizes
stochastically the time the system spends under repair. Hence, it is optimal
with respect to both the maximum availability and the maximum discounted
operation time criteria, irrespective of the values of the repair rates and the
discount rate.

In Katehakis and Derman (1984), the optimality of the SFR policy with
respect to the average system operation time criterion was obtained in the
case of a non-varying environment. The proof involved establishing that the SFR
policy minimizes the expected first passage times to the functioning state; this
was done by showing that the functional equations of the relevant Markovian
decision problem were valid under this policy. The method of proof then was
based on induction on the number of components. The present proof,
based on Propositions 1 and 2 below, although simpler than the original,
establishes optimality under a stronger criterion for a more general model.

We  also  consider  the  following  stochastic  scheduling  problem  examined  by 
Van  der  Heyden  (1981);  see  also  Weiss  and  Pinedo  (1982).  Jobs  arrive 
according  to  a  Poisson  process  with  rate  r  .  The  processing  time  of  a  job  is 
an  exponentially  distributed  random  variable  with  a  parameter  that  is  chosen 
upon  arrival  by  sampling  from  a  known  distribution  F.  All  random  variables 
are  assumed  to  be  independent.  There  are  p  processors  and  the  objective 
is  to  minimize  the  expected  time  until  all  jobs  have  been  completed  (also  called 
the  expected  makespan).  It  has  been  established,  in  the  papers  above,  that 
the  policy  which  always  assigns  processors  to  the  uncompleted  jobs  with  the 
Longest  Expected  Processing  Times  (LEPT  policy)  minimizes  the  expected 
makespan. We modify Van der Heyden's work and obtain the stochastic
optimality of the LEPT policy.


2. Characterization of Stochastically Optimal Policies in First Passage
Problems.


We  first  consider  the  discrete  time  first  passage  problem  in  Markovian 
Decision  Theory  on  a  (for  simplicity)  countable  state  space,  which  is  specified 
by  the  following  elements. 

1. The state space $S$ and the action sets $A(s)$, $s \in S$,

2. The transition law: $\{p(s'/s,a), s', s \in S, a \in A(s)\}$,

3. A subset $S_0$ of $S$,

4. An initial state $s_0$.

We will denote this problem by $(\Pi)$. A policy $\pi$ generates a stochastic
process $\{X_\pi(k), k = 1,2,\ldots\}$. The first passage time from a state $s$ to $S_0$
will be denoted by $T_\pi(s)$.

A policy $\pi^0$ is called stochastically optimal if it satisfies

(1)  $T_{\pi^0}(s) \le_{st} T_\pi(s)$ for all alternative policies $\pi$, $s \in S$,

where, given two random variables $Y_1$, $Y_2$, we define:

(2)  $Y_1 \le_{st} Y_2$ if and only if $P(Y_1 \le y) \ge P(Y_2 \le y)$ for all $y$.
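As a small numerical illustration of the ordering in definition (2), the following Python sketch compares two distribution functions on a grid of points. The function names, the tolerance and the geometric example are assumptions introduced only for illustration; a check on a finite grid can refute the ordering but cannot prove it for every $y$.

```python
def stochastically_smaller(cdf_1, cdf_2, points, tol=1e-12):
    """Check P(Y1 <= y) >= P(Y2 <= y) at every y in `points` (definition (2))."""
    return all(cdf_1(y) >= cdf_2(y) - tol for y in points)


def geometric_cdf(p):
    """CDF of the number of trials up to the first success (success probability p)."""
    return lambda n: 1.0 - (1.0 - p) ** n if n >= 1 else 0.0


# Hypothetical example: a passage time with a higher per-step success
# probability should be stochastically smaller than one with a lower one.
fast, slow = geometric_cdf(0.5), geometric_cdf(0.2)
grid = range(0, 50)
fast_before_slow = stochastically_smaller(fast, slow, grid)
slow_before_fast = stochastically_smaller(slow, fast, grid)
```

Here `fast_before_slow` holds while `slow_before_fast` does not, which matches the intuition that the ordering compares entire distributions rather than means only.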

The  method  we  use  to  show  the  stochastic  optimality  of  a  policy  and  to 
discover  whether  such  a  policy  exists  is  based  on  establishing  the  optimality 
of  the  policy  in  the  following  class  of  finite  horizon  problems. 

We define the finite horizon problem $(\Pi_n)$, $n \ge 1$, as follows.

1. State space: $S_n = \{(s;m), s \in S, m = 0,1,\ldots,n\}$.

2. Action sets: $A(s;m) = A(s)$, $m = 1,\ldots,n$, $A(s;0) = \emptyset$.

3. Transition law:

(3)  $p((s';m')/(s;m),a) = \begin{cases} p(s'/s,a) & \text{if } m' = m-1 \text{ and } s \in S \setminus S_0, \\ 1 & \text{if } m' = m-1 \text{ and } s = s' \in S_0, \\ 0 & \text{otherwise.} \end{cases}$

4. Reward structure:

(4)  $r(s;m) = \begin{cases} 1 & \text{if } s \in S_0 \text{ and } m = 0, \\ 0 & \text{otherwise.} \end{cases}$

Every policy in the process $(\Pi)$ induces a policy referring to the family
of processes $\{(\Pi_n), n \ge 1\}$ which does not depend on $n$, and vice versa.
Thus, there is a 1-1 correspondence between policies associated with the
problem $(\Pi)$ and policies referring to the family of problems $\{(\Pi_n)\}$ which do
not depend on $n$.

The  following  can  be  easily  established. 

Proposition 1. A policy is stochastically optimal in $(\Pi)$ if and only if it is
optimal in $(\Pi_n)$ for all $n \ge 1$.

Proof: It suffices to notice that in $n$ steps the process terminates either
in the set of states $\{(s;0) : s \in S_0\}$ with reward 1 or in some
other state with reward 0. Thus, the total expected reward in $(\Pi_n)$
corresponding to any policy $\pi$ defined in $(\Pi)$ and initial state $(s;n)$ is
$P[T_\pi(s) \le n]$.
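The identity used in this proof can be checked numerically for a fixed policy. The Python sketch below is only an illustration under stated assumptions: the three-state transition matrix `P`, the target set `targets` and both function names are hypothetical. It computes the total expected reward of the finite horizon problem by backward induction, with the target states absorbing as in (3), and compares it with the first passage probability obtained by propagating probability mass forward.

```python
def finite_horizon_value(P, targets, n):
    """Expected terminal reward after n steps; reward 1 iff the chain is in `targets`."""
    states = range(len(P))
    V = [1.0 if s in targets else 0.0 for s in states]  # values at m = 0
    for _ in range(n):
        # Target states are absorbing, all other states follow the transition law.
        V = [1.0 if s in targets else sum(P[s][t] * V[t] for t in states)
             for s in states]
    return V


def first_passage_cdf(P, targets, start, n):
    """P[T <= n] for the first passage time T from `start` to `targets`."""
    if start in targets:
        return 1.0
    states = range(len(P))
    mass = [0.0] * len(P)
    mass[start] = 1.0
    for _ in range(n):
        new_mass = [0.0] * len(P)
        for s in states:
            for t in states:
                if t not in targets:  # keep only paths that have not hit the targets
                    new_mass[t] += mass[s] * P[s][t]
        mass = new_mass
    return 1.0 - sum(mass)


# Hypothetical chain induced by some fixed policy; state 2 plays the role of S_0.
P = [[0.5, 0.3, 0.2],
     [0.1, 0.6, 0.3],
     [0.0, 0.0, 1.0]]
targets = {2}
gaps = [abs(finite_horizon_value(P, targets, n)[0] - first_passage_cdf(P, targets, 0, n))
        for n in range(8)]
```

Every entry of `gaps` should be zero up to rounding, since both computations evaluate the same probability.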

For a continuous time first passage problem the approach described above
is applicable if we can use the device of uniformization; see Jensen (1953),
Veinott (1969) and Lippman (1975). To be precise, the continuous time problem,
which is denoted by $(\Pi_c)$, is specified by:

1. The state space $S$ and the action sets $A(s)$, $s \in S$,

2. The transition law, which is given in terms of transition rates
$\{v(s'/s,a), s', s \in S, a \in A(s)\}$,

3. A subset $S_0$ of $S$,

4. An initial state $s_0$.


Let us define

(5)  $v(s,a) = \sum_{s'} v(s'/s,a)$.

The device of uniformization essentially consists of noticing that if the
transition rates are bounded and if we choose $v \ge v(s,a)$ for all $s$, $a$, then,
by counting (dummy) transitions back to state $s$ at a rate $(v - v(s,a))$, the
sojourn times in all states are equalized, i.e., they are i.i.d. exponentially
distributed random variables with rate $v$. Thus, a discrete first passage
problem is defined on the same state and action spaces with transition
probabilities given by:

(6)  $p(s'/s,a) = \begin{cases} v(s'/s,a)/v & \text{if } s \ne s', \\ (v - v(s,a))/v & \text{if } s = s'. \end{cases}$
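The construction in (6) can be sketched in a few lines of Python for a single fixed action. The function name `uniformize`, the list-of-lists rate format and the example rates are assumptions made for illustration; the diagonal of the rate matrix is ignored and the total rate out of each state plays the role of $v(s,a)$.

```python
def uniformize(rates, v=None):
    """Turn transition rates rates[s][t] (s != t) into the probabilities of (6)."""
    n = len(rates)
    # Total rate out of each state, i.e. v(s, a) for the fixed action considered.
    out = [sum(rates[s][t] for t in range(n) if t != s) for s in range(n)]
    if v is None:
        v = max(out)  # smallest admissible uniformization constant
    if v <= 0 or v < max(out):
        raise ValueError("v must be positive and bound every total rate")
    P = [[rates[s][t] / v if t != s else (v - out[s]) / v for t in range(n)]
         for s in range(n)]
    return P, v


# Hypothetical three-state example; state 2 has no outgoing transitions.
rates = [[0.0, 2.0, 1.0],
         [0.5, 0.0, 0.5],
         [0.0, 0.0, 0.0]]
P_unif, v_unif = uniformize(rates)
row_sums = [sum(row) for row in P_unif]
```

Each row of `P_unif` sums to one, and a state whose total rate is below `v_unif` receives a positive probability for the dummy transition back to itself.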

Let $T^c_\pi(s)$ denote the first passage time from state $s$ to the subset $S_0$
for the original continuous time process and let $T_\pi(s)$ denote the first
passage time from $s$ to $S_0$ for the discrete time process above (note that
without any loss of generality we can assume that $v = 1$ and thus regard
$T_\pi(s)$ as a random variable counting the number of transitions to $S_0$).

We  formally  state  the  following, 

Proposition 2. A policy $\pi^0$ is stochastically optimal for the continuous time
problem if and only if the actions prescribed by $\pi^0$ constitute a
stochastically optimal policy in the discrete time problem.

Proof: It is known, Keilson (1979), that the uniformized process is
probabilistically identical to the original process; thus we can assume that
$T^c_\pi(s)$ refers to the latter. Let $Y_1, Y_2, \ldots$ denote the sequence of i.i.d.
sojourn times in the continuous time uniformized process. Then,

(7)  $T^c_\pi(s) = \sum_{k=1}^{T_\pi(s)} Y_k$

and the proof can be completed easily; see Ross (1983, p. 255).
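One way to see how a distribution in discrete time determines the one in continuous time is the following sketch. It assumes that the sojourn times are independent of the discrete first passage time, so that the sum of exponential sojourn times is at most $t$ exactly when a Poisson process of rate $v$ has counted at least that many events by time $t$; the continuous time distribution function is then a Poisson mixture of the discrete one. The function name `continuous_cdf`, the truncation level and the example parameters are illustrative assumptions.

```python
import math


def continuous_cdf(discrete_cdf, v, t, terms=200):
    """P[T_c <= t] = sum_n P[N(t) = n] * P[T <= n], with N(t) Poisson of mean v * t."""
    total = 0.0
    weight = math.exp(-v * t)  # P[N(t) = 0]
    for n in range(terms):
        total += weight * discrete_cdf(n)
        weight *= v * t / (n + 1)  # P[N(t) = n + 1] from P[N(t) = n]
    return total


# Hypothetical check: a single transition at rate r to the target set, uniformized
# with constant v >= r. The discrete passage time is geometric with parameter r / v
# and the continuous passage time should be exponential with rate r.
r, v, t = 0.7, 2.0, 1.5
mixture = continuous_cdf(lambda n: 1.0 - (1.0 - r / v) ** n, v, t)
exact = 1.0 - math.exp(-r * t)
```

Because the Poisson weights are nonnegative, a policy whose discrete distribution function is larger at every $n$ also has a larger continuous distribution function at every $t$, which is the direction of Proposition 2 that this sketch illustrates.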


Proposition  2  enables  us  to  study  continuous  time  problems  by  applying 
proposition  1  to  the  discrete  time  problem  obtained  via  uniformization. 


3.  Optimal  Repair  of  a  Series  System. 

At any point in time the state of the system is specified by a vector
$x = (x_1,\ldots,x_N)$, with the convention that $x_i = 1$ or $0$ according to whether the
$i$-th component is functioning or not, and a scalar $\theta$ which denotes the state
of the environment. Thus, $S = \{0,1\}^N \times \Theta$ is the set of all possible states and
$W = \{(\mathbf{1};\theta), \theta \in \Theta\}$ is the set of all functioning states for the system.

Given a state $(x;\theta) \in S$, we define:

$C_0(x;\theta) = \{i : x_i = 0\}$,

$C_1(x;\theta) = \{i : x_i = 1\}$,

$(\delta_i,x;\theta) = ((x_1,\ldots,x_{i-1},\delta,x_{i+1},\ldots,x_N);\theta)$, for $\delta = 0$ or $1$,

$a(x;\theta) = \arg\min\{\mu_i(\theta) : i \in C_0(x;\theta)\}$, i.e. the index of the failed component with the smallest failure rate,

$\mu(x;\theta) = \sum_{i=1}^{N} x_i \mu_i(\theta)$,

$q(\theta) = \sum_{\theta'} q(\theta'/\theta)$.

Assumption A: we assume that if $j \le i$ then $\mu_j(\theta) \le \mu_i(\theta)$ for all $\theta \in \Theta$.
This in particular implies that $a(x;\theta) = a(x)$ for all $(x;\theta) \in S$.


When the system is in state $(x;\theta)$ the set of all possible actions can be
identified with the set of the failed components, with the interpretation that
action $i \in C_0(x;\theta)$ means that the repairman is assigned to component $i$.
When the system is in state $(x;\theta)$ and action $i \in C_0(x;\theta)$ is chosen the
following transitions are possible:

i) to state $(1_i,x;\theta)$, with rate $\lambda_i$,

ii) to state $(0_k,x;\theta)$, with rate $\mu_k(\theta)$, $k \in C_1(x;\theta)$,

iii) to state $(x;\theta')$, with rate $q(\theta'/\theta)$, $\theta, \theta' \in \Theta$.

The discrete time decision problem induced by the above is defined on the
same state space with the following transition law. When the system is in
state $(x;\theta)$ and action $i \in C_0(x;\theta)$ is chosen the following transitions
are possible:

i) to state $(1_i,x;\theta)$, with probability $\lambda_i / v$,

ii) to state $(0_k,x;\theta)$, with probability $\mu_k(\theta) / v$, $k \in C_1(x;\theta)$,

iii) to state $(x;\theta')$, with probability $q(\theta'/\theta) / v$, $\theta, \theta' \in \Theta$,

iv) to state $(x;\theta)$, with probability $(v - \mu(x;\theta) - \lambda_i - q(\theta)) / v$.


3.1 Construction of a family of Markovian Decision Problems. We construct a
family of discrete time, finite horizon Markovian decision models as follows.
For any positive integer $n$ construct problem $(\Pi_n)$ by defining:

1. States: $(x;\theta;m)$, $(x;\theta) \in S$, $m = 0,1,\ldots,n$.

2. Action sets: $A(x;\theta;m) = C_0(x;\theta)$.

3. System dynamics: when the system is in state $(x;\theta;m)$ and action
$i \in C_0(x;\theta)$ is chosen the following transitions are possible:

i) to state $(1_i,x;\theta;m-1)$, with probability $\lambda_i / v$,

ii) to state $(0_k,x;\theta;m-1)$, with probability $\mu_k(\theta) / v$, $k \in C_1(x;\theta)$,

iii) to state $(x;\theta';m-1)$, with probability $q(\theta'/\theta) / v$, $\theta, \theta' \in \Theta$,

iv) to state $(x;\theta;m-1)$, with probability $(v - \mu(x;\theta) - \lambda_i - q(\theta)) / v$,

where $v$ is any constant greater than the sum of all transition rates.

4. Reward structure:

$r(x;\theta;m) = \begin{cases} 1 & \text{if } (x;\theta) \in W \text{ and } m = 0, \\ 0 & \text{otherwise.} \end{cases}$


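3.2. The SFR policy is stochastically optimal. Take $n$ and denote by $V(x;\theta;m)$
the value of state $(x;\theta;m)$ in the finite horizon problem $(\Pi_n)$. To write
the functional equations for $V(x;\theta;m)$, $m = 1,\ldots,n$, we first define

$A(x;\theta;m;i) = \frac{1}{v} \Big[ (v - \mu(x;\theta) - \lambda_i - q(\theta)) V(x;\theta;m-1) + \lambda_i V(1_i,x;\theta;m-1) + \sum_{k=1}^{N} x_k \mu_k(\theta) V(0_k,x;\theta;m-1) + \sum_{\theta'} q(\theta'/\theta) V(x;\theta';m-1) \Big]$.

The functional equations are then

(8)  $V(x;\theta;m) = \max_{i \in C_0(x;\theta)} \{A(x;\theta;m;i)\}$,

(9)  $V(x;\theta;0) = \begin{cases} 1 & \text{if } (x;\theta) \in W, \\ 0 & \text{otherwise.} \end{cases}$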
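The recursion (8)-(9) is easy to explore numerically on a small instance. The Python sketch below is a rough illustration under explicit assumptions, not the authors' computation: the rates are hypothetical, functioning states are treated as absorbing (as in the transition law (3)), and the uniformization constant is taken as a simple upper bound on the total rate. At every state and stage it records whether the smallest failure rate first action attains the maximum in (8).

```python
from itertools import product


def solve_repair(lam, mu, q, n):
    """Backward induction for (8)-(9); returns the values and an SFR optimality flag.

    lam[i]: repair rate of component i; mu[th][i]: failure rate of component i in
    environment th; q[th][th2]: environment transition rate (diagonal ignored).
    """
    N, envs = len(lam), range(len(mu))
    states = [(x, th) for x in product((0, 1), repeat=N) for th in envs]
    # Uniformization constant: an upper bound on the total rate out of any state.
    v = max(lam) + max(sum(row) for row in mu) + max(sum(row) for row in q)

    def flip(x, i, value):
        return x[:i] + (value,) + x[i + 1:]

    def action_value(V, x, th, i):
        total, rate_out = lam[i] * V[flip(x, i, 1), th], lam[i]
        for k in range(N):
            if x[k] == 1:  # a functioning component may fail
                total += mu[th][k] * V[flip(x, k, 0), th]
                rate_out += mu[th][k]
        for th2 in envs:
            if th2 != th:  # the environment may change
                total += q[th][th2] * V[x, th2]
                rate_out += q[th][th2]
        return (total + (v - rate_out) * V[x, th]) / v  # dummy self transition

    V = {s: 1.0 if all(s[0]) else 0.0 for s in states}  # boundary condition (9)
    sfr_optimal = True
    for _ in range(n):
        new_V = {}
        for x, th in states:
            failed = [i for i in range(N) if x[i] == 0]
            if not failed:
                new_V[x, th] = 1.0  # functioning states are absorbing (assumption)
                continue
            values = {i: action_value(V, x, th, i) for i in failed}
            best = max(values.values())
            sfr = min(failed, key=lambda i: mu[th][i])
            if values[sfr] < best - 1e-12:
                sfr_optimal = False
            new_V[x, th] = best
        V = new_V
    return V, sfr_optimal


# Hypothetical instance satisfying Assumption A (same failure rate order in both
# environments); the repair rates are deliberately not ordered.
lam = [0.3, 1.0, 0.5]
mu = [[0.1, 0.2, 0.4], [0.3, 0.5, 0.9]]
q = [[0.0, 0.2], [0.1, 0.0]]
V_repair, sfr_ok = solve_repair(lam, mu, q, 25)
```

On instances of this kind the flag `sfr_ok` is expected to remain true for every horizon, in line with the lemma that follows; a failed check would indicate either a violation of Assumption A or a modelling difference in the sketch.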


Lemma 3. Let $\sum_i x_i \le N - 2$. Then, under Assumption A, for any $(x;\theta) \in S$,
$i, j \in C_0(x;\theta)$, $m \ge 1$ and $j \le i$ the following inequalities hold:

(10)  $(\lambda_i - \lambda_j) V(x;\theta;m) \ge \lambda_i V(1_i,x;\theta;m) - \lambda_j V(1_j,x;\theta;m)$,

(11)  $V(x;\theta;m) \le V(1_i,x;\theta;m)$, $1 \le i \le N$.


Proof. We prove the above inequalities simultaneously by induction on $m$.
Throughout the proof we use the quantities $L_{i,j}(x;\theta;m)$ defined by:

(12)  $L_{i,j}(x;\theta;m) = (\lambda_i - \lambda_j) V(x;\theta;m) - \lambda_i V(1_i,x;\theta;m) + \lambda_j V(1_j,x;\theta;m)$.

The induction hypothesis is expressed by (13), (14) below, with $j \le i$:

(13)  $L_{i,j}(x;\theta;m) \ge 0$, for all $x$, $\theta$, $i$, $j$,

and

(14)  $V(x;\theta;m) \le V(1_i,x;\theta;m)$, $1 \le i \le N$.

(13) and (14) are obviously true for $m = 1$. We next complete the induction as
follows.

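Step 1: we first show that the following inequality holds.

(15)  $(\lambda_i - \lambda_j) \big[ (v - \lambda_{a(x)} - \mu(x;\theta) - q(\theta)) V(x;\theta;m) + \lambda_{a(x)} V(1_{a(x)},x;\theta;m) \big]$

$\ge \lambda_i \big[ (v - \lambda_{a(x)} - \mu_i(\theta) - \mu(x;\theta) - q(\theta)) V(1_i,x;\theta;m) + \lambda_{a(x)} V(1_i,1_{a(x)},x;\theta;m) + \mu_i(\theta) V(x;\theta;m) \big]$

$- \lambda_j \big[ (v - \lambda_{\iota(x)} - \mu_j(\theta) - \mu(x;\theta) - q(\theta)) V(1_j,x;\theta;m) + \lambda_{\iota(x)} V(1_{\iota(x)},1_j,x;\theta;m) + \mu_j(\theta) V(x;\theta;m) \big]$,

where $\iota(x) = \begin{cases} i & \text{if } j = a(x), \\ a(x) & \text{if } j \ne a(x). \end{cases}$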

To establish (15) we first note that

(16)  $-\lambda_j L_{i,j}(x;\theta;m) = (\lambda_i - \lambda_j)(-\lambda_j V(x;\theta;m) + \lambda_j V(1_j,x;\theta;m))$

$- \lambda_i(-\lambda_j V(1_i,x;\theta;m) + \lambda_j V(1_i,1_j,x;\theta;m))$

$+ \lambda_j(-\lambda_i V(1_j,x;\theta;m) + \lambda_i V(1_i,1_j,x;\theta;m))$.

We also note that from (13) we have

(17)  $\lambda_{a(x)} L_{i,j}(1_{a(x)},x;\theta;m) \ge 0$.

Next, (13) implies that

(18)  $(v - \lambda_{a(x)} - \mu_j(\theta) - \mu(x;\theta) - q(\theta)) L_{i,j}(x;\theta;m) \ge 0$,

which leads to

(19)  $(v - \mu(x;\theta) - q(\theta)) L_{i,j}(x;\theta;m) - \lambda_{a(x)} L_{i,j}(x;\theta;m)$

$\ge \mu_j(\theta)\lambda_i (V(x;\theta;m) - V(1_i,x;\theta;m)) - \mu_j(\theta)\lambda_j (V(x;\theta;m) - V(1_j,x;\theta;m))$

$\ge \mu_i(\theta)\lambda_i (V(x;\theta;m) - V(1_i,x;\theta;m)) - \mu_j(\theta)\lambda_j (V(x;\theta;m) - V(1_j,x;\theta;m))$,

where the last inequality follows from the induction hypothesis (11) and the
fact that $\mu_j(\theta) \le \mu_i(\theta)$.

Using (16) or (17) in (19) (according to whether $a(x) = j$ or $a(x) \ne j$) we
obtain (15) after simple operations.

Step 2: we next show that the following inequality holds.

(20)  $(\lambda_i - \lambda_j) \sum_{k=1}^{N} x_k \mu_k(\theta) V(0_k,x;\theta;m) \ge \lambda_i \sum_{k=1}^{N} x_k \mu_k(\theta) V(1_i,0_k,x;\theta;m) - \lambda_j \sum_{k=1}^{N} x_k \mu_k(\theta) V(1_j,0_k,x;\theta;m)$.

To prove (20) it suffices to note that (10) implies that:

(21)  $L_{i,j}(0_k,x;\theta;m) \ge 0$, for all $x$, $\theta$, $i$, $j$.

Step 3: the next inequality is established.

(22)  $(\lambda_i - \lambda_j) \sum_{\theta'} q(\theta'/\theta) V(x;\theta';m) \ge \lambda_i \sum_{\theta'} q(\theta'/\theta) V(1_i,x;\theta';m) - \lambda_j \sum_{\theta'} q(\theta'/\theta) V(1_j,x;\theta';m)$.

To prove (22) it suffices to note that (10) implies that:

(23)  $\sum_{\theta'} q(\theta'/\theta) L_{i,j}(x;\theta';m) \ge 0$, for all $x$, $\theta$, $i$, $j$.

Notice that for (23) to hold it is essential that the failure rates of the
components of the system have the same ordering for all $\theta$ (Assumption A) and
that the repair rates are independent of $\theta$.

Step 4: we next complete the induction step for inequalities (10).

We multiply both sides of inequalities (15), (20), (22) by $1/v$ and add
them. Thus, after some simple algebra we obtain the following inequality.

(24)  $(\lambda_i - \lambda_j) A(x;\theta;m+1;a(x)) \ge \lambda_i A(1_i,x;\theta;m+1;a(x)) - \lambda_j A(1_j,x;\theta;m+1;\iota(x))$.

Now observe that the following relations hold due to the inductive hypothesis.

(25)  $A(x;\theta;m+1;a(x)) = V(x;\theta;m+1)$,

(26)  $A(1_i,x;\theta;m+1;a(x)) = V(1_i,x;\theta;m+1)$,

(27)  $A(1_j,x;\theta;m+1;\iota(x)) \le V(1_j,x;\theta;m+1)$.

Therefore, (10) holds for $m + 1$ also.

Step 5: we complete the induction step for relation (11).

Let $b(x) = \begin{cases} a(1_i,x) & \text{if } i = a(x), \\ a(x) & \text{if } i \ne a(x). \end{cases}$

After simple computations in (11) for $m+1$ the reader may check that it
suffices to establish

(28)  $(v - \lambda_{a(x)} - \mu_i(\theta) - \mu(x;\theta) - q(\theta)) (V(1_i,x;\theta;m) - V(x;\theta;m))$

$\ge \lambda_{b(x)} (V(1_{a(x)},x;\theta;m) - V(1_{b(x)},1_i,x;\theta;m))$

$+ \sum_{k=1}^{N} x_k \mu_k(\theta) (V(0_k,x;\theta;m) - V(1_i,0_k,x;\theta;m))$

$+ \sum_{\theta'} q(\theta'/\theta) (V(x;\theta';m) - V(1_i,x;\theta';m))$.

Now, (28) holds since, from the inductive hypothesis, the left hand side of
(28) is nonnegative while the right hand side is nonpositive; note that the
first term in the right hand side of (28) is always a difference of the type
covered by the induction hypothesis (14).

Theorem 1. Under the assumptions made, the SFR policy minimizes
stochastically the time the system spends under repair.

Proof. The previous lemma shows that the SFR policy is optimal for all $(\Pi_n)$,
$n \ge 1$. Hence the result follows from Propositions 1 and 2.


4. Scheduling Jobs to Minimize the Makespan. The particular problem we
examine is that considered in Van der Heyden (1981). The state space for this
problem is: $S = \{x : x = \{x_1,x_2,\ldots,x_m\}, m = 1,2,\ldots, x_i \in (0,B)\} \cup \{\emptyset\}$,
where state $\emptyset$ is the empty state for the system, $x_i$ denotes the
processing rate of the $i$-th job in the system (waiting or being processed),
and $B$ is some bound. The action set in state $x$, $A(x)$, contains all
subsets of $x$ that contain at most $p$ elements.

After uniformization, the functional equations for the finite horizon
problem we consider can be written as:

(29)  $V(x;m) = \max_{a \in A(x)} \Big\{ \Big( \sum_{j \in a} x_j V(x \setminus \{x_j\};m-1) + r\,E\,V(x \cup \{y\};m-1) + \big(v - \sum_{j \in a} x_j - r\big) V(x;m-1) \Big) / v \Big\}$, $1 \le m \le n$,

with boundary condition

(30)  $V(x;0) = \begin{cases} 1 & \text{if } x = \emptyset, \\ 0 & \text{otherwise,} \end{cases}$

where the expectation is taken with respect to the random processing rate $y$
chosen from the distribution $F$.
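A recursion of the form (29)-(30) can be evaluated directly on a small instance. The Python sketch below is illustrative only and rests on several assumptions: the distribution $F$ is replaced by a hypothetical two-point distribution, the empty state is treated as absorbing, the uniformization constant is taken as $pB + r$ with $B$ the largest rate present, and the recursion includes the dummy self transition needed for the probabilities to sum to one. It also records whether the LEPT action (processing the jobs with the smallest rates) attains the maximum at every state visited.

```python
from functools import lru_cache
from itertools import combinations


def lept_check(x0, p, r, F, n):
    """Evaluate V(x0; n) and flag whether LEPT attains every maximum.

    x0: initial processing rates; p: processors; r: arrival rate;
    F: list of (rate, probability) pairs for the rate of an arriving job.
    """
    B = max([y for y, _ in F] + list(x0))
    v = p * B + r  # upper bound on the total transition rate
    lept_ok = True

    @lru_cache(maxsize=None)
    def V(x, m):
        nonlocal lept_ok
        if not x:
            return 1.0  # empty state reached: all jobs completed
        if m == 0:
            return 0.0
        stay = V(x, m - 1)
        arrive = sum(w * V(tuple(sorted(x + (y,))), m - 1) for y, w in F)
        values = {}
        for size in range(min(p, len(x)) + 1):
            for a in combinations(range(len(x)), size):
                done = sum(x[j] * V(x[:j] + x[j + 1:], m - 1) for j in a)
                busy = sum(x[j] for j in a)
                values[a] = (done + r * arrive + (v - busy - r) * stay) / v
        best = max(values.values())
        # x is sorted increasingly, so the first jobs have the longest expected times.
        lept = tuple(range(min(p, len(x))))
        if values[lept] < best - 1e-12:
            lept_ok = False
        return best

    return V(tuple(sorted(x0)), n), lept_ok


# Hypothetical instance: three initial jobs, two processors, two possible rates
# for arriving jobs.
value, lept_ok = lept_check((0.5, 1.0, 2.0), p=2, r=0.3,
                            F=[(0.5, 0.5), (2.0, 0.5)], n=6)
```

The returned `value` approximates the probability that the uniformized process empties the system within `n` transitions, and `lept_ok` is expected to stay true, in line with Theorem 2 below.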

We  can  readily  establish  the  following, 

Theorem  2.  The  LEPT  policy  is  stochastically  optimal. 

Proof:  To  establish  the  stochastic  optimality  of  the  LEPT  policy,  it  suffices 
to  show  that  it  is  optimal  in  the  finite  horizon  problem  defined  above  for  all 
n  .  The  proof  of  this  is  equivalent  to  establishing  inequalities  which  are 
analogous  to  those  in  part  3  of  Van  der  Heyden  (1981),  i.e.  inequalities 
$(3.2)_n$, $(3.3)_n$, $(3.4)_n$ but with the inequality sign reversed. This can be
done  by  examining  all  cases  as  in  Van  der  Heyden  (1981)  and  following  exactly 
the  same  steps  but  with  symmetric  arguments. 